Representing an address space of unequal granularity and alignment

ABSTRACT

A computer-implemented method according to one embodiment includes identifying a data write to a specific position within a virtual address space, determining an entry within a metadata structure that corresponds to the specific position within the virtual address space, and adding state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.

BACKGROUND

The present invention relates to data storage and random access memory (RAM), and more specifically, this invention relates to management of a virtual address space within a software management layer of a system.

In software systems, address spaces may be used to represent a placement of data. A virtual address space may require a software management layer to keep track of virtual address space usage, physical addresses associated with virtual addresses, etc. However, current management layer implementations may take up a large amount of available metadata storage and may not allow for direct access for virtual address spaces having unequal granularity and alignment.

SUMMARY

A computer-implemented method according to one embodiment includes identifying a data write to a specific position within a virtual address space, determining an entry within a metadata structure that corresponds to the specific position within the virtual address space, and adding state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.

According to another embodiment, a computer program product for representing an address space of unequal granularity and alignment comprises a computer readable storage medium having program instructions embodied therewith, where the computer readable storage medium is not a transitory signal per se, and where the program instructions are executable by a processor to cause the processor to perform a method comprising identifying a data write to a specific position within a virtual address space, utilizing the processor, determining an entry within a metadata structure that corresponds to the specific position within the virtual address space, utilizing the processor, and adding, utilizing the processor, state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.

A system according to another embodiment comprises a processor, and logic integrated with the processor, executable by the processor, or integrated with and executable by the processor, where the logic is configured to identify a data write to a specific position within a virtual address space, determine an entry within a metadata structure that corresponds to the specific position within the virtual address space, and add state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.

Other aspects and embodiments of the present invention will become apparent from the following detailed description, which, when taken in conjunction with the drawings, illustrate by way of example the principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a network architecture, in accordance with one embodiment.

FIG. 2 shows a representative hardware environment that may be associated with the servers and/or clients of FIG. 1, in accordance with one embodiment.

FIG. 3 illustrates a tiered data storage system in accordance with one embodiment.

FIG. 4 illustrates a method for representing an address space of unequal granularity and alignment, in accordance with one embodiment.

FIG. 5 illustrates a first scenario illustrating interleaved 4 KB entries and 8 KB entries, as well as a second scenario illustrating non-interleaved 8 KB entries, in accordance with one embodiment.

FIG. 6 illustrates an exemplary address space where an 8 KB write to a position of 4 KB within the address space leads to an overwrite of two existing entries, in accordance with one embodiment.

FIG. 7 illustrates an exemplary address space where an 8 KB write to a position of 8 KB within the address space leads to a merge, in accordance with one embodiment.

FIG. 8 illustrates an exemplary chart including a list of entries within a metadata structure that correspond to a consistent alignment within a corresponding address space, in accordance with one embodiment.

FIG. 9 illustrates an exemplary chart of entries within a metadata structure that correspond to a varying alignment within a corresponding address space, in accordance with one embodiment.

DETAILED DESCRIPTION

The following description discloses several preferred embodiments of systems, methods and computer program products for representing an address space of unequal granularity and alignment. Various embodiments provide a method to identify a data write to a virtual address space, determine a metadata structure entry corresponding to a position of the data write, and add state information associated with the data write to the metadata structure entry.

The following description is made for the purpose of illustrating the general principles of the present invention and is not meant to limit the inventive concepts claimed herein. Further, particular features described herein can be used in combination with other described features in each of the various possible combinations and permutations.

Unless otherwise specifically defined herein, all terms are to be given their broadest possible interpretation including meanings implied from the specification as well as meanings understood by those skilled in the art and/or as defined in dictionaries, treatises, etc.

It must also be noted that, as used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless otherwise specified. It will be further understood that the terms “includes” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

In one general embodiment, a computer-implemented method includes identifying a data write to a specific position within a virtual address space, determining an entry within a metadata structure that corresponds to the specific position within the virtual address space, and adding state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.

In another general embodiment, a computer program product for representing an address space of unequal granularity and alignment comprises a computer readable storage medium having program instructions embodied therewith, where the computer readable storage medium is not a transitory signal per se, and where the program instructions are executable by a processor to cause the processor to perform a method comprising identifying a data write to a specific position within a virtual address space, utilizing the processor, determining an entry within a metadata structure that corresponds to the specific position within the virtual address space, utilizing the processor, and adding, utilizing the processor, state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.

In another general embodiment, a system comprises a processor, and logic integrated with the processor, executable by the processor, or integrated with and executable by the processor, where the logic is configured to identify a data write to a specific position within a virtual address space, determine an entry within a metadata structure that corresponds to the specific position within the virtual address space, and add state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.

FIG. 1 illustrates an architecture 100, in accordance with one embodiment. As shown in FIG. 1, a plurality of remote networks 102 are provided including a first remote network 104 and a second remote network 106. A gateway 101 may be coupled between the remote networks 102 and a proximate network 108. In the context of the present architecture 100, the networks 104, 106 may each take any form including, but not limited to a LAN, a WAN such as the Internet, public switched telephone network (PSTN), internal telephone network, etc.

In use, the gateway 101 serves as an entrance point from the remote networks 102 to the proximate network 108. As such, the gateway 101 may function as a router, which is capable of directing a given packet of data that arrives at the gateway 101, and a switch, which furnishes the actual path in and out of the gateway 101 for a given packet.

Further included is at least one data server 114 coupled to the proximate network 108, and which is accessible from the remote networks 102 via the gateway 101. It should be noted that the data server(s) 114 may include any type of computing device/groupware. Coupled to each data server 114 is a plurality of user devices 116. User devices 116 may also be connected directly through one of the networks 104, 106, 108. Such user devices 116 may include a desktop computer, lap-top computer, hand-held computer, printer or any other type of logic. It should be noted that a user device 111 may also be directly coupled to any of the networks, in one embodiment.

A peripheral 120 or series of peripherals 120, e.g., facsimile machines, printers, networked and/or local storage units or systems, etc., may be coupled to one or more of the networks 104, 106, 108. It should be noted that databases and/or additional components may be utilized with, or integrated into, any type of network element coupled to the networks 104, 106, 108. In the context of the present description, a network element may refer to any component of a network.

According to some approaches, methods and systems described herein may be implemented with and/or on virtual systems and/or systems which emulate one or more other systems, such as a UNIX system which emulates an IBM z/OS environment, a UNIX system which virtually hosts a MICROSOFT WINDOWS environment, a MICROSOFT WINDOWS system which emulates an IBM z/OS environment, etc. This virtualization and/or emulation may be enhanced through the use of VMWARE software, in some embodiments.

In more approaches, one or more networks 104, 106, 108, may represent a cluster of systems commonly referred to as a “cloud.” In cloud computing, shared resources, such as processing power, peripherals, software, data, servers, etc., are provided to any system in the cloud in an on-demand relationship, thereby allowing access and distribution of services across many computing systems. Cloud computing typically involves an Internet connection between the systems operating in the cloud, but other techniques of connecting the systems may also be used.

FIG. 2 shows a representative hardware environment associated with a user device 116 and/or server 114 of FIG. 1, in accordance with one embodiment. Such figure illustrates a typical hardware configuration of a workstation having a central processing unit 210, such as a microprocessor, and a number of other units interconnected via a system bus 212.

The workstation shown in FIG. 2 includes a Random Access Memory (RAM) 214, Read Only Memory (ROM) 216, an I/O adapter 218 for connecting peripheral devices such as disk storage units 220 to the bus 212, a user interface adapter 222 for connecting a keyboard 224, a mouse 226, a speaker 228, a microphone 232, and/or other user interface devices such as a touch screen and a digital camera (not shown) to the bus 212, communication adapter 234 for connecting the workstation to a communication network 235 (e.g., a data processing network) and a display adapter 236 for connecting the bus 212 to a display device 238.

The workstation may have resident thereon an operating system such as the Microsoft Windows® Operating System (OS), a MAC OS, a UNIX OS, etc. It will be appreciated that a preferred embodiment may also be implemented on platforms and operating systems other than those mentioned. A preferred embodiment may be written using XML, C, and/or C++ language, or other programming languages, along with an object oriented programming methodology. Object oriented programming (OOP), which has become increasingly used to develop complex applications, may be used.

Now referring to FIG. 3, a storage system 300 is shown according to one embodiment. Note that some of the elements shown in FIG. 3 may be implemented as hardware and/or software, according to various embodiments. The storage system 300 may include a storage system manager 312 for communicating with a plurality of media on at least one higher storage tier 302 and at least one lower storage tier 306. The higher storage tier(s) 302 preferably may include one or more random access and/or direct access media 304, such as hard disks in hard disk drives (HDDs), nonvolatile memory (NVM), solid state memory in solid state drives (SSDs), flash memory, SSD arrays, flash memory arrays, etc., and/or others noted herein or known in the art. The lower storage tier(s) 306 may preferably include one or more lower performing storage media 308, including sequential access media such as magnetic tape in tape drives and/or optical media, slower accessing HDDs, slower accessing SSDs, etc., and/or others noted herein or known in the art. One or more additional storage tiers 316 may include any combination of storage memory media as desired by a designer of the system 300. Also, any of the higher storage tiers 302 and/or the lower storage tiers 306 may include some combination of storage devices and/or storage media.

The storage system manager 312 may communicate with the storage media 304, 308 on the higher storage tier(s) 302 and lower storage tier(s) 306 through a network 310, such as a storage area network (SAN), as shown in FIG. 3, or some other suitable network type. The storage system manager 312 may also communicate with one or more host systems (not shown) through a host interface 314, which may or may not be a part of the storage system manager 312. The storage system manager 312 and/or any other component of the storage system 300 may be implemented in hardware and/or software, and may make use of a processor (not shown) for executing commands of a type known in the art, such as a central processing unit (CPU), a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), etc. Of course, any arrangement of a storage system may be used, as will be apparent to those of skill in the art upon reading the present description.

In more embodiments, the storage system 300 may include any number of data storage tiers, and may include the same or different storage memory media within each storage tier. For example, each data storage tier may include the same type of storage memory media, such as HDDs, SSDs, sequential access media (tape in tape drives, optical disk in optical disk drives, etc.), direct access media (CD-ROM, DVD-ROM, etc.), or any combination of media storage types. In one such configuration, a higher storage tier 302, may include a majority of SSD storage media for storing data in a higher performing storage environment, and remaining storage tiers, including lower storage tier 306 and additional storage tiers 316 may include any combination of SSDs, HDDs, tape drives, etc., for storing data in a lower performing storage environment. In this way, more frequently accessed data, data having a higher priority, data needing to be accessed more quickly, etc., may be stored to the higher storage tier 302, while data not having one of these attributes may be stored to the additional storage tiers 316, including lower storage tier 306. Of course, one of skill in the art, upon reading the present descriptions, may devise many other combinations of storage media types to implement into different storage schemes, according to the embodiments presented herein.

According to some embodiments, the storage system (such as 300) may include logic configured to receive a request to open a data set, logic configured to determine if the requested data set is stored to a lower storage tier 306 of a tiered data storage system 300 in multiple associated portions, logic configured to move each associated portion of the requested data set to a higher storage tier 302 of the tiered data storage system 300, and logic configured to assemble the requested data set on the higher storage tier 302 of the tiered data storage system 300 from the associated portions.

Of course, this logic may be implemented as a method on any device and/or system or as a computer program product, according to various embodiments.

Now referring to FIG. 4, a flowchart of a method 400 is shown according to one embodiment. The method 400 may be performed in accordance with the present invention in any of the environments depicted in FIGS. 1-3, among others, in various embodiments. Of course, more or fewer operations than those specifically described in FIG. 4 may be included in method 400, as would be understood by one of skill in the art upon reading the present descriptions.

Each of the steps of the method 400 may be performed by any suitable component of the operating environment. For example, in various embodiments, the method 400 may be partially or entirely performed by one or more servers, computers, or some other device having one or more processors therein. The processor, e.g., processing circuit(s), chip(s), and/or module(s) implemented in hardware and/or software, and preferably having at least one hardware component may be utilized in any device to perform one or more steps of the method 400. Illustrative processors include, but are not limited to, a central processing unit (CPU), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), etc., combinations thereof, or any other suitable computing device known in the art.

As shown in FIG. 4, method 400 may initiate with operation 402, where a data write to a specific position within a virtual address space is identified. In one embodiment, the data write may include the writing of data to the specific position within the virtual address space. For example, a user or application may write a portion of data having a predetermined size (e.g., an 8 KB portion of data, etc.) to storage space at a predetermined location/offset (e.g., at an offset of 16 KB within the storage space, etc.).

Additionally, in one embodiment, the data write may be performed by a user, by an application, etc. In another embodiment, the virtual address space may include a range of addresses that correspond to a virtual storage space. For example, the virtual storage space may include a portion of a virtual data drive. In this way, the virtual address space may represent locations within virtual memory where data is stored.

Further, in one embodiment, the virtual address space may be assigned to one or more of a user, an application, etc. In another embodiment, the virtual address space may be assigned by an operating system of a system (e.g., a computing device such as a server, etc.). In yet another embodiment, the virtual address space may have an alignment granularity.

For example, reads and writes within the virtual address space are made according to the alignment granularity. In another example, all data stored within the storage space may have a starting point within the virtual address space that is a multiple of the alignment granularity. In yet another example, if the alignment granularity is 4 KB, data stored within the storage space may have a starting point of 0 KB, 4 KB, 8 KB, 12 KB, 16 KB, etc. within the virtual address space.

Further still, in one embodiment, the virtual address space may have a size granularity. For example, the size granularity may indicate a size of reads and writes that are made to the storage space. In another example, all data read or written to the storage space may have a size that is a multiple of the size granularity. In yet another example, if the size granularity is 8 KB, data stored within the storage space may have a size of 8 KB, 16 KB, 24 KB, etc. Of course, however, other sizes are supported but may require a read to align the write.

Also, in one embodiment, the size granularity may be twice the alignment granularity. For example, the size granularity may be 8 KB, and the alignment granularity may be 4 KB. This may result in an address space of unequal granularity and alignment.
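The two granularities described above can be illustrated with a minimal Python sketch. It assumes the example values from the text (a 4 KB alignment granularity and an 8 KB size granularity); the function name is_direct_write and the constant names are hypothetical and do not appear in the description. The sketch only checks whether a write starts on a multiple of the alignment granularity and has a size that is a multiple of the size granularity; other writes may still be supported but may require a read to align the write.

```python
KB = 1024
ALIGNMENT_GRANULARITY = 4 * KB  # example value from the text
SIZE_GRANULARITY = 8 * KB       # example value from the text (twice the alignment)


def is_direct_write(offset: int, size: int) -> bool:
    """Return True if a write fits both granularities as-is.

    Hypothetical helper: a write that fails this check is not necessarily
    rejected; per the text it may require a read to align the write.
    """
    if offset < 0 or size <= 0:
        return False
    starts_on_alignment = offset % ALIGNMENT_GRANULARITY == 0
    size_is_whole_grains = size % SIZE_GRANULARITY == 0
    return starts_on_alignment and size_is_whole_grains
```

For instance, an 8 KB write at an offset of 16 KB or at an offset of 4 KB satisfies both conditions, whereas an 8 KB write at an offset of 2 KB, or a 4 KB write at an offset of 8 KB, does not.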

In addition, method 400 may proceed with operation 404, where an entry within a metadata structure that corresponds to the specific position within the virtual address space is determined. In one embodiment, the metadata structure may represent the virtual address space at a management layer of a system. For example, the virtual address space may refer to locations within a storage layer of a system. In another example, the metadata structure may be incorporated within a software management layer of a system. In another embodiment, the metadata structure may link the virtual address space to a physical address space. For example, each entry within the metadata structure may link a virtual address space of a data write to a physical address space where the data associated with the write is physically located. In yet another embodiment, the metadata structure may also contain additional metadata about the data associated with the write, one or more pointers to other metadata structures that contain information about the data associated with the write, etc.

Furthermore, in one embodiment, the metadata structure may include an array. In another embodiment, the metadata structure may allow for direct access to the data in the virtual storage space from the management layer. In yet another embodiment, the metadata structure may be stored separately from the data.

Further still, in one embodiment, the metadata structure may include a plurality of entries. For example, one or more entries within the metadata structure may each correspond to a position within the virtual address space where data is stored. In another example, each entry within the metadata structure may correspond to a grain of the address space (e.g., a portion of the virtual address space having a size matching the size granularity of the virtual address space, etc.).

For instance, if the address space has a size granularity of 8 KB, each entry within the metadata structure may correspond to 8 KB within the address space. In this way, the entries of the metadata structure may correspond to a size granularity of the virtual address space.

Also, in one embodiment, each entry within the metadata structure may include an indication as to whether the corresponding grain is in use, a physical address of the grain, etc. In another embodiment, the metadata structure may have a fixed number of entries. In yet another embodiment, each entry may include an entry index number (e.g., an integer, etc.) within the metadata structure.

Additionally, in one embodiment, the entry within the metadata structure may be determined utilizing one or more equations. For example, for a size/alignment granularity with a factor of 2 (e.g., an alignment granularity of X and a size granularity of 2X), and for a specific position pos within the virtual address space, the entry index number i may be found as follows: i = rounddown((pos + 0.5X) / (1.5X)). More specifically, with an alignment granularity of 4 KB, and a specific position pos within the virtual address space, the entry index number i may be found as follows: i = rounddown((pos + 2 KB) / 6 KB), where “rounddown” rounds down to the nearest integer. In another example, if an 8 KB portion of data is written to the storage space at a location of 16 KB within the virtual address space, the entry index number may be: rounddown((16 KB + 2 KB) / 6 KB) = 3. In another embodiment, when reading the data, two positions may be checked where the two cell indices of the positions may be calculated as follows: Entry 1 = (pos − 2 KB) / 6 KB, and Entry 2 = (pos + 2 KB) / 6 KB.
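The index calculation above may be sketched in Python as follows. This is an illustrative sketch rather than a definitive implementation: the function names entry_index and read_candidates are hypothetical, integer arithmetic is used in place of the fractional factors 0.5X and 1.5X, and two details are assumptions not stated in the text, namely that the two read candidates are also rounded down to integers and that a negative first candidate is clamped to zero.

```python
from typing import Tuple

KB = 1024
X = 4 * KB  # alignment granularity from the example; size granularity is 2 * X


def entry_index(pos: int, x: int = X) -> int:
    """Compute i = rounddown((pos + 0.5x) / (1.5x)) with integer arithmetic.

    Numerator and denominator are both doubled so no fractions are needed.
    """
    return (2 * pos + x) // (3 * x)


def read_candidates(pos: int, x: int = X) -> Tuple[int, int]:
    """Return the two entry indices that may be checked when reading at pos.

    Assumptions: both quotients are rounded down, and the first index is
    clamped at 0 so that a position near the start stays within the array.
    """
    first = max(0, (2 * pos - x) // (3 * x))
    second = (2 * pos + x) // (3 * x)
    return first, second


if __name__ == "__main__":
    # Positions on the 4 KB alignment grid and the entry each one maps to.
    for pos_kb in (0, 4, 8, 12, 16, 20, 24):
        print(pos_kb, entry_index(pos_kb * KB))
```

With the 4 KB example, positions of 0 KB, 4 KB, 8 KB, 12 KB, 16 KB, 20 KB and 24 KB map to entries 0, 1, 1, 2, 3, 3 and 4, respectively. Under this formula an odd-numbered entry is reachable from two aligned positions while an even-numbered entry is reachable from one, which is consistent with the left or right alignment being recorded as state within the entry.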

Further, method 400 may proceed with operation 406, where state information associated with the data write is added to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space. In one embodiment, metadata describing the data write may be added to the entry. In another embodiment, a location of the data write may be added to the entry. For example, the location of the data write may include a location in a physical address space where the data is written.

Further still, in one embodiment, the state information may be added as two bits within the entry in the metadata structure. For example, the two bits may indicate a size of the entry, and whether the entry is aligned left or right within the portion of virtual address space represented by the entry within the metadata structure. In another embodiment, the state information may be adjusted to account for a numbering of the entry within the metadata structure.
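One way to picture two bits of state information is the following Python sketch. The bit layout is purely an assumption made for illustration: the text does not specify which bit carries the size and which carries the alignment, nor how the states differ between odd and even entry numbers. The names Size, Align, pack_state and unpack_state are hypothetical, and the two sizes correspond to the 4 KB and 8 KB entries of the running example.

```python
from enum import IntEnum
from typing import Tuple


class Size(IntEnum):
    SMALL = 0  # entry covers one alignment grain (4 KB in the example)
    LARGE = 1  # entry covers one size grain (8 KB in the example)


class Align(IntEnum):
    LEFT = 0   # entry starts at the lower address it may contain
    RIGHT = 1  # entry starts at the higher address it may contain


def pack_state(size: Size, align: Align) -> int:
    """Pack size into bit 0 and alignment into bit 1 (assumed layout)."""
    return (int(align) << 1) | int(size)


def unpack_state(bits: int) -> Tuple[Size, Align]:
    """Recover (size, alignment) from the two low bits of an entry."""
    return Size(bits & 0b01), Align((bits >> 1) & 0b01)
```

Under this assumed layout, a large, right-aligned entry is stored as the value 3, and unpack_state(pack_state(s, a)) returns the original pair for every combination of s and a.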

Also, in one embodiment, the size and the alignment of the data write may be associated with a numbering of the entry. For example, an entry having an odd entry number may have four possible odd entry states, and an entry having an even entry number may have three possible even entry states different from the four possible odd entry states. In another example, the state information stored within an entry may reflect the numbering of the entry, and may be analyzed in association with the numbering of the entry in order to determine the size and the alignment of the data write within the virtual address space.

In addition, in one embodiment, the alignment may include a left alignment or a right alignment. For example, the left alignment may correspond to the lower address that the entry may contain. In another example, the right alignment may correspond to the higher address that the entry may contain. In yet another example, a difference between the left alignment and the right alignment may be the alignment granularity (e.g., 4 KB, etc.).

Furthermore, in one embodiment, one or more additional entries may be adjusted within the metadata structure, based on the data write. For example, a state of one or more neighbor entries to the determined entry may be identified. In another example, if the data write affects an address space represented by a neighbor entry, the neighbor entry may be modified within the metadata structure. For instance, if the data write affects a previous alignment within the virtual address space, a plurality of entries may be modified within the metadata structure.

Further still, in one embodiment, the metadata structure may be subdivided into a plurality of self-contained groups. In another embodiment, each self-contained group may not have any entry carryover into adjacent groups (e.g., all entries within the self-contained group may start and end within the self-contained group). This may improve a paging performance of the metadata structure, such that the metadata structure may be simply divided into one or more pages.

Also, in one embodiment, the entry within the metadata structure may be used to identify the specific position within the virtual address space, and to translate the specific position within the virtual address space to a location within a physical address space (e.g., of physical storage, etc.) where data is stored. The entry within the metadata structure may also be used to identify and translate a location within a physical address space (e.g., of physical storage, etc.) where data is stored to a corresponding specific position within the virtual address space.

For example, a request may be received at the management layer of the system, where the request indicates a specific position within a physical address space. In response to the request, an entry within the metadata structure that contains the specific position within the physical address space may be located. Additionally, a specific position within the virtual address space may be determined, based on the information stored within the entry, where the information includes the state information.

In this way, the metadata structure may be used to access data within the virtual address space and the physical address space, as well as to link a virtual data location within the virtual address space to a physical data location within the physical address space. Also, the metadata structure may be used to manage both the virtual address space and the physical address space. For example, entries within the metadata structure may be used to determine whether corresponding virtual address space locations are in use, as well as to determine physical storage locations that correspond to virtual address space locations. This may enable both virtual and physical data management within the system, utilizing the metadata structure.

Additionally, a number of entries within the metadata structure that are needed to represent an address space of unequal granularity and alignment within a management layer of a system may be reduced. For example, if the size granularity of the virtual address space is 8 KB, and the alignment granularity of the virtual address space is 4 KB, an average number of entries per 12 KB of address space may be reduced from three to two when compared to solutions that have entries corresponding to an alignment granularity of the virtual address space.
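The three-to-two comparison above can be reproduced with a short Python sketch. It rests on an assumption that is not stated explicitly in the text: that the metadata array is sized to hold the highest index produced by the formula i = rounddown((pos + 0.5X) / (1.5X)) for the last aligned position in the address space. The function names are hypothetical, and the address space size is assumed to be a positive multiple of the alignment granularity.

```python
KB = 1024


def entries_per_alignment_grain(space: int, x: int) -> int:
    """Entries needed when every alignment grain of size x has its own entry."""
    return space // x


def entries_with_state(space: int, x: int) -> int:
    """Entries needed when indices follow rounddown((pos + 0.5x) / (1.5x)).

    Assumes the array must reach the index of the last aligned position,
    which is space - x, and that indices start at 0.
    """
    last_pos = space - x
    return (2 * last_pos + x) // (3 * x) + 1
```

For a 12 KB address space with a 4 KB alignment granularity, entries_per_alignment_grain returns 3 and entries_with_state returns 2, matching the example in the preceding paragraph.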

More specifically, by incorporating state information into entries within the metadata structure, each entry within the metadata structure may correspond to a size granularity of the address space, instead of an alignment granularity of the address space. As a result, half as many entries may be needed within the metadata structure (when compared to entries corresponding to an alignment granularity) when the size granularity of the address space is twice the size of the alignment granularity of the address space.

This may also decrease a size of the metadata structure stored within a management layer of a system, which may increase an amount of available storage space within the system, which may in turn increase a performance of the system (e.g., since the additional available storage space may be used for other management duties, etc.). Additionally, a size of the metadata structure may be static for a predetermined address space. This may eliminate a need for one or more allocations associated with the metadata structure.

A Method for Representing an Address Space of Unequal Granularity and Alignment

Introduction

In software systems, address spaces may be used to represent a placement of data. This may be used for RAM allocated to a process, the disk space of a file system, the allocated space of a volume, etc.

In one embodiment, the address space may have a minimal grain size. For example, the grain in a block device may be 512 bytes, and a grain size of a flash drive may be 4 KB. The address space may also be virtual, such as with virtual RAM or space efficient block storage. A virtual address space may require a software management layer that contains an entry per grain of address space. The entry may contain information such as an indication as to whether the grain is in use, a physical address of the grain, etc.

When managing large address spaces, an amount of metadata in the management layer may become an issue. The smaller the grain size is, the more metadata may be required to represent the address space.

Now consider an address space with a grain size of 8 KB, meaning reads and writes to the address space may be a multiple of 8 KB. It would be straightforward to have a metadata entry per 8 KB grain. However, further consider that an alignment of the address space may be 4 KB, so that writes may or may not be aligned to 8 KB. It may no longer be possible to have a metadata entry per 8 KB because writes at 4 KB alignment may create entries of 4 KB. This may be considered an address space of unequal granularity and alignment.

It is important to note that the 8 KB size granularity and 4 KB alignment granularity are used solely for purposes of example, and are not to be construed as limiting in any way. The described implementation may be applied to any size granularity that is two times the alignment granularity, and may be extended to other ratios as well.

One solution may hold an entry in the metadata structure for each 4 KB of address space. However, this may cost twice the amount of metadata in relation to an entry per 8 KB of address space. This metadata may be stored in an array, thereby providing efficient memory utilization and superior performance due to direct access.

Assuming it is possible to combine two adjacent entries of 4 KB into a single entry of 8 KB, an implementation is provided for storing the address space in a metadata structure using about 33% less metadata than the aforementioned 4 KB solution requires (two entries instead of three per 12 KB of address space), while retaining the robust performance of direct access.

Summary

In one embodiment, the implementation may be based on the fact that it is possible to merge two neighboring entries of 4 KB (8 KB entries may not be touched). This makes it clear that the case that requires the highest number of entries within the metadata structure is when 4 KB and 8 KB entries are interleaved. FIG. 5 shows a first scenario 502 illustrating interleaved 4 KB entries 506 and 8 KB entries 508, as well as a second scenario 504 illustrating non-interleaved 8 KB entries 508.

One solution may be to use a dynamic sized structure since the number of entries is dynamic. However, this may reduce a performance of the management layer of the system. For example, dynamic allocations may be required, and access may not be direct. These issues may be overcome by utilizing a metadata structure that has a predefined size, provides direct access, and uses less memory than a 4 KB solution.

In one embodiment, the current implementation may average 2 entries per 12 KB. The size of each entry may be either 4 KB or 8 KB. Entries may also be left unused. Additionally, an entry may be either aligned to 4 KB or to 8 KB. This may require two state bits per entry that represent the following states of an entry:

-   8 KB aligned left
-   4 KB aligned left or right
-   8 KB or 4 KB aligned right
-   Unused

In one embodiment, the left alignment may be the lower of the two start addresses the entry may contain, and the right alignment may be the higher of the two start addresses the entry may contain. In this example, the right alignment may always be 4 KB more than the left alignment. The second and third states listed above may have different meanings for even and odd entries.

In one embodiment, a direct outcome of the above is that an address may have only two possible entries it can be placed in. This may provide performance equivalent to an array. It will also be shown how neighboring entries interact and how the entries can be stored in self-contained pages.

DESCRIPTION

In one embodiment, the term pos may include a position of a read/write within the address space, where the position corresponds to an alignment granularity of the address space. For example, if the alignment granularity is 4 KB, pos may be the 4 KB aligned position of a read/write within the address space. This term will be used in formulas below.

Writing to an Address Space

When writing to the address space, the index number of an entry within the metadata structure to use to represent pos may be found using the following formula:

Entry index=rounddown((pos+2 KB)/6 KB).
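As a minimal sketch, the write formula above can be expressed in Python as follows. The function name write_entry_index and the choice to express positions in bytes are assumptions made for illustration only; the 4 KB, 2 KB and 6 KB constants come from the example granularities used in this description.

```python
KB = 1024

def write_entry_index(pos: int) -> int:
    """Return the index of the metadata entry used to record a write at `pos`.

    `pos` is a byte offset and is assumed to be aligned to the 4 KB
    alignment granularity of the example address space.
    """
    if pos < 0 or pos % (4 * KB) != 0:
        raise ValueError("pos must be a non-negative multiple of 4 KB")
    # rounddown((pos + 2 KB) / 6 KB)
    return (pos + 2 * KB) // (6 * KB)

# Positions 0, 4, 8, 12 and 16 KB map to entries 0, 1, 1, 2 and 3.
indices = [write_entry_index(p * KB) for p in (0, 4, 8, 12, 16)]
```

Under these assumptions, positions 4 KB and 8 KB share entry 1, which matches the left and right alignments available to an odd entry.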

Once the target entry is determined by the above formula, the state of the neighboring metadata structure entries may be checked because a write to pos might overwrite content of an adjacent entry within the metadata structure. Furthermore, the write to pos might lead to two consecutive 4 KB entries that may be merged.

FIG. 6 illustrates an exemplary address space 600 where an 8 KB write 602 to a position of 4 KB within the address space 600 leads to an overwrite of two existing entries representing locations 604 (from 0 KB to 8 KB) and 606 (from 8 KB to 16 KB) within the address space 600.

FIG. 7 illustrates an exemplary address space 700 where an 8 KB write 702 to a position of 8 KB within the address space 700 leads to a merge of two entries representing a first location 704 (from 0 KB to 4 KB) and a second location 706 (from 4 KB to 8 KB) within the address space 700, to create a merged entry 708 (from 0 KB to 8 KB). It should be noted that the two entries representing the first location 704 (from 0 KB to 4 KB) and the second location 706 (from 4 KB to 8 KB) within the address space 700 prior to the creation of the merged entry 708 (from 0 KB to 8 KB) are included within an intermediate state, and are shown for purposes of example only.
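The overwrite and merge behavior described for FIGS. 6 and 7 may be sketched at the level of address ranges, independent of how the ranges are packed into metadata entries. The following Python sketch is an illustration only: the function apply_write, the representation of content as (position, length) pairs in KB, and the starting state assumed for the FIG. 7 case (a 4 KB range at 0 KB followed by an 8 KB range at 4 KB) are assumptions and are not taken from the figures.

```python
def apply_write(extents, pos, length=8):
    """Apply a write of `length` KB at `pos` KB to a list of (pos, len) ranges.

    Existing ranges that overlap the write are truncated to whatever lies
    outside it; afterwards two adjacent 4 KB ranges are merged into one
    8 KB range. All values are in KB.
    """
    end = pos + length
    result = []
    for p, l in extents:
        e = p + l
        if e <= pos or p >= end:
            result.append((p, l))          # no overlap, keep as is
            continue
        if p < pos:
            result.append((p, pos - p))    # part left of the write survives
        if e > end:
            result.append((end, e - end))  # part right of the write survives
    result.append((pos, length))
    result.sort()

    merged = []
    for p, l in result:
        prev = merged[-1] if merged else None
        if prev and prev[1] == 4 and l == 4 and prev[0] + 4 == p:
            merged[-1] = (prev[0], 8)      # merge two neighboring 4 KB ranges
        else:
            merged.append((p, l))
    return merged

# FIG. 6 style overwrite: an 8 KB write at 4 KB truncates both neighbors.
fig6 = apply_write([(0, 8), (8, 8)], 4)
# FIG. 7 style merge (assumed starting state): an 8 KB write at 8 KB.
fig7 = apply_write([(0, 4), (4, 8)], 8)
```

In the second call, the range at 4 KB is first truncated to 4 KB and then merged with its 4 KB neighbor at 0 KB, which corresponds to the intermediate state noted above.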

In one embodiment, a cost of a write to the address space may be determined by an alignment of the write compared to the alignment of the data previously written within the address space. For example, the cost of a write may include a modification of one entry, if the existing alignment is retained. In another example, the cost of the write may include a modification of two entries (see, for example, FIG. 6). In yet another example, the cost of the write may include a modification of three entries (see, for example, FIG. 7). If a merge is required on both ends within the address space, up to four entries may be modified.

FIG. 8 illustrates an exemplary chart 800 including a list of entries 802 within a metadata structure that correspond to a consistent alignment 804 within a corresponding address space. In one embodiment, the consistent alignment 804 may be created utilizing writes that are aligned with a size granularity of the address space. The actual content 806 stored within the entries 802 is included, along with possible content 808 (e.g., content that could possibly be stored within the entries 802) and empty space 810 (e.g., locations within the entries 802 where no content is stored).

FIG. 9 illustrates an exemplary chart 900 of entries 902 within a metadata structure that correspond to a varying alignment 904 within a corresponding address space. In one embodiment, the varying alignment 904 may be created utilizing writes that meet an alignment granularity of the address space, but are not aligned with a size granularity of the address space. The actual content 906 stored within the entries 902 is included, along with possible content 908 (e.g., content that could possibly be stored within the entries 902) and empty space 910 (e.g., locations within the entries 902 where no content is stored).

FIGS. 8 and 9 detail what each entry can contain and demonstrate how writes at different alignments are stored in the structure. This illustrates how a static metadata structure may implement dynamic behavior and adjust for both a consistent alignment 804 as shown in FIG. 8 and a varying alignment 904 as shown in FIG. 9.

As shown in FIGS. 8 and 9, the potential content may differ between even and odd entry indices within the metadata structures. For example, entries with an even index may have four possible states, whereas entries with an odd index may have three possible states. Table 1 illustrates exemplary states of entries within a metadata structure, in accordance with one embodiment. Of course, it should be noted that the exemplary states shown in Table 1 are set forth for illustrative purposes only, and thus should not be construed as limiting in any manner.

TABLE 1

State   Even entry index      Odd entry index
1       8 KB aligned left     8 KB aligned left
2       4 KB aligned left     4 KB aligned right
3       4 KB aligned right    8 KB aligned right
4       Entry is unused       -

In one embodiment, a state of the entry, including its size and alignment within the address space, may be described utilizing a predetermined portion of the entry within the metadata structure (e.g., two bits, etc.). This predetermined portion may be analyzed in association with the entry number (even or odd) to determine the size and alignment of the entry within the address space.

Table 2 illustrates the possible alignments for an entry with an index i within a metadata structure, in accordance with one embodiment. Of course, it should be noted that the possible alignments shown in Table 2 are set forth for illustrative purposes only, and thus should not be construed as limiting in any manner.

TABLE 2

For an entry with an index i, the two possible alignments it may contain are (pos and len in KB):

Alignment   i is even                      i is odd
Left        pos = 6 * i, len = 4 or 8      pos = 6 * i − 2, len = 8
Right       pos = 6 * i + 4, len = 4       pos = 6 * i + 2, len = 4 or 8
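The contents of Table 2 may be written as a small helper, shown here as a hedged Python sketch. The function name possible_extents and the dictionary layout are assumptions for illustration; the positions and lengths are those listed in Table 2, in KB.

```python
def possible_extents(i: int):
    """Return the left and right alignments entry `i` may contain.

    Each side maps to (start position in KB, tuple of allowed lengths in KB),
    following Table 2.
    """
    if i % 2 == 0:
        return {"left": (6 * i, (4, 8)), "right": (6 * i + 4, (4,))}
    return {"left": (6 * i - 2, (8,)), "right": (6 * i + 2, (4, 8))}

entry0 = possible_extents(0)
entry1 = possible_extents(1)
# For every index the right position is 4 KB above the left position.
gaps = {possible_extents(i)["right"][0] - possible_extents(i)["left"][0]
        for i in range(10)}
```

Evaluating the helper for consecutive indices shows that neighboring entries may describe overlapping ranges, which is why neighbors are checked on a write.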

Reading from an Address Space

In one embodiment, an entry may be determined for a read that is aligned to a predetermined granularity (in this exemplary case, 4 KB). In another embodiment, a single position may be located in one of two entries. For example, when reading from pos=4 KB, the data may be in entry 0 as the end of an 8 KB write to pos=0 KB, or it may be in entry 1 as the result of a write to pos=4 KB.

The two cell indices of a position pos may be calculated as follows:

Entry 1=rounddown((pos−2 KB)/6 KB)

Entry 2=rounddown((pos+2 KB)/6 KB).
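A minimal Python sketch of the two read formulas follows. The function name read_candidates is hypothetical, positions are assumed to be byte offsets aligned to 4 KB, and clamping the first index to zero for pos=0 KB is an assumption, since the description does not state how that boundary is handled.

```python
KB = 1024

def read_candidates(pos: int):
    """Return the entry indices that may hold data for a read at `pos`."""
    if pos < 0 or pos % (4 * KB) != 0:
        raise ValueError("pos must be a non-negative multiple of 4 KB")
    # Assumption: a negative first index (only for pos = 0) is clamped to 0.
    first = max(0, (pos - 2 * KB) // (6 * KB))
    second = (pos + 2 * KB) // (6 * KB)
    # The two formulas may yield the same index, so duplicates are dropped.
    return sorted({first, second})

at_4kb = read_candidates(4 * KB)
at_8kb = read_candidates(8 * KB)
```

For pos=4 KB the sketch returns entries 0 and 1, matching the example above; for pos=8 KB both formulas select entry 1.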

Subdividing the Metadata into Self-Contained Groups

In one embodiment, it may be possible to divide the metadata into self-contained groups. A self-contained group may not have a carryover into its first entry from a previous group, and it may not have a carryover from its last entry into a following group. This may provide efficient paging of the metadata within the metadata structure. These groups may also allow for two consecutive 4 KB chunks, where one is the last of a first group and the second is the first of a following group (assuming an alignment granularity of 4 KB).

The amount of address space covered by a group may be calculated as follows:

coverage=6 KB*n rounded down to nearest 4 KB,

where n is the number of entries in the group.
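The coverage formula may be sketched in Python as follows; the function name group_coverage is an assumption for illustration, and the result is expressed in bytes.

```python
KB = 1024

def group_coverage(n: int) -> int:
    """Address space, in bytes, covered by a self-contained group of n entries."""
    raw = 6 * KB * n
    # Round down to the nearest multiple of the 4 KB alignment granularity.
    return raw - raw % (4 * KB)

coverage_kb = [group_coverage(n) // KB for n in (1, 2, 3, 4)]
```

Groups with an even number of entries cover exactly 6 KB per entry, while groups with an odd number of entries lose 2 KB to the rounding.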

In this way, an amount of space needed by the metadata structure to represent the address space may be reduced. For example, assuming a size granularity of 8 KB and an alignment granularity of 4 KB, the metadata structure may only require two entries per 12 KB of address space (e.g., instead of three per 12 KB of address space, as required when using a metadata structure having entries corresponding to an alignment granularity of the address space).
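The entry counts in this comparison can be reproduced with a short Python sketch. The function names and the sample address space size are assumptions chosen for illustration; only the per-12 KB ratios come from the description.

```python
KB = 1024

def entries_per_alignment_grain(space_bytes: int) -> int:
    """One entry per 4 KB of address space (three per 12 KB)."""
    return -(-space_bytes // (4 * KB))  # ceiling division

def entries_in_described_structure(space_bytes: int) -> int:
    """Two entries per 12 KB of address space, i.e. one per 6 KB on average."""
    return -(-space_bytes // (6 * KB))  # ceiling division

space = 12 * KB * 1000  # hypothetical address space of 12,000 KB
baseline = entries_per_alignment_grain(space)
compact = entries_in_described_structure(space)
```

For this sample size the baseline needs 3000 entries and the described structure needs 2000.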

Additionally, the size of the metadata structure may be static, as well as the number of entries within the metadata structure, and therefore no allocations may be required. Further, finding an entry within the metadata structure that contains any address within the address space may be direct (e.g., in the 4 KB alignment granularity example, the address may be found by checking at most two entries).

Further still, determining the entry within the metadata structure to write any position to is done utilizing a simple and inexpensive formula. Also, there may be no cache miss when looking at the second entry. In addition, no merge penalty may exist if an alignment is consistent. Furthermore, 8 KB chunks may not be affected; this may be important if, for example, the address space undergoes deduplication that typically takes place at 8 KB. Further still, the metadata structure may simply and efficiently be divided into pages. Also, the cost may be only two bits per entry for state, and a merge penalty may exist only if an alignment keeps changing.

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein includes an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which includes one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

Moreover, a system according to various embodiments may include a processor and logic integrated with and/or executable by the processor, the logic being configured to perform one or more of the process steps recited herein. By integrated with, what is meant is that the processor has logic embedded therewith as hardware logic, such as an application specific integrated circuit (ASIC), a FPGA, etc. By executable by the processor, what is meant is that the logic is hardware logic; software logic such as firmware, part of an operating system, part of an application program; etc., or some combination of hardware and software logic that is accessible by the processor and configured to cause the processor to perform some functionality upon execution by the processor. Software logic may be stored on local and/or remote memory of any memory type, as known in the art. Any processor known in the art may be used, such as a software processor module and/or a hardware processor such as an ASIC, a FPGA, a central processing unit (CPU), an integrated circuit (IC), a graphics processing unit (GPU), etc.

It will be clear that the various features of the foregoing systems and/or methodologies may be combined in any way, creating a plurality of combinations from the descriptions presented above.

It will be further appreciated that embodiments of the present invention may be provided in the form of a service deployed on behalf of a customer to offer service on demand.

While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of a preferred embodiment should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

What is claimed is:
 1. A computer-implemented method, comprising: identifying a data write to a specific position within a virtual address space; determining an entry within a metadata structure that corresponds to the specific position within the virtual address space; and adding state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.
 2. The computer-implemented method of claim 1, wherein the virtual address space has an alignment granularity and a size granularity, where the size granularity is different from the alignment granularity.
 3. The computer-implemented method of claim 1, wherein the metadata structure represents the virtual address space at a management layer of a system.
 4. The computer-implemented method of claim 1, wherein the metadata structure includes an array.
 5. The computer-implemented method of claim 1, wherein the metadata structure includes a plurality of entries, where one or more entries within the metadata structure each correspond to a position within the virtual address space where data is stored, and each entry within the metadata structure corresponds to a portion of the virtual address space having a size matching a size granularity of the virtual address space.
 6. The computer-implemented method of claim 1, wherein each entry within the metadata structure includes an indication as to whether a corresponding grain is in use, and a physical address of the corresponding grain.
 7. The computer-implemented method of claim 1, wherein the state information is added as two bits within the entry in the metadata structure, where the two bits indicate a size of the entry, and whether the entry is aligned left or right within a portion of virtual address space represented by the entry within the metadata structure.
 8. The computer-implemented method of claim 1, wherein the size and the alignment of the data write are associated with a numbering of the entry.
 9. The computer-implemented method of claim 1, wherein the alignment includes a left alignment or a right alignment.
 10. The computer-implemented method of claim 1, further comprising adjusting one or more additional entries within the metadata structure, based on the data write.
 11. The computer-implemented method of claim 1, wherein the metadata structure is subdivided into a plurality of self-contained groups, where each self-contained group within the plurality of self-contained groups does not have any entry carryover into adjacent groups.
 12. A computer program product for representing an address space of unequal granularity and alignment, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a processor to cause the processor to perform a method comprising: identifying a data write to a specific position within a virtual address space, utilizing the processor; determining an entry within a metadata structure that corresponds to the specific position within the virtual address space, utilizing the processor; and adding, utilizing the processor, state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.
 13. The computer program product of claim 12, wherein the virtual address space has an alignment granularity and a size granularity, where the size granularity is different from the alignment granularity.
 14. The computer program product of claim 12, wherein the metadata structure represents the virtual address space at a management layer of a system.
 15. The computer program product of claim 12, wherein the metadata structure includes an array.
 16. The computer program product of claim 12, wherein the metadata structure includes a plurality of entries, where one or more entries within the metadata structure each correspond to a position within the virtual address space where data is stored, and each entry within the metadata structure corresponds to a portion of the virtual address space having a size matching a size granularity of the virtual address space.
 17. The computer program product of claim 12, wherein each entry within the metadata structure includes an indication as to whether a corresponding grain is in use, and a physical address of the corresponding grain.
 18. The computer program product of claim 12, wherein the state information is added as two bits within the entry in the metadata structure, where the two bits indicate a size of the entry, and whether the entry is aligned left or right within a portion of virtual address space represented by the entry within the metadata structure.
 19. The computer program product of claim 12, wherein the size and the alignment of the data write are associated with a numbering of the entry.
 20. A system, comprising: a processor; and logic integrated with the processor, executable by the processor, or integrated with and executable by the processor, the logic being configured to: identify a data write to a specific position within a virtual address space; determine an entry within a metadata structure that corresponds to the specific position within the virtual address space; and add state information associated with the data write to the entry within the metadata structure, the state information including a size of the data write within the virtual address space and an alignment of the data write within the virtual address space.